Variational Integration for Speech Signal Processing

Authors

  • Max A. Little
  • Irene M. Moroz
  • Patrick E. McSharry
  • Stephen J. Roberts
Abstract

Voice production is generally modelled as a two-component dynamical process composed of the vocal folds and the vocal tract. Figure 1 shows a diagram of the arrangement of the vocal folds and vocal tract inside the head and neck. The vocal tract comprises the pharyngeal, oral and nasal cavities. It is usually modelled as a linear acoustic resonator, and the vocal folds as a nonlinear dynamical system comprising masses, viscoelastic damping and forcing due to lung pressure. However, systems used for speech transmission, analysis or compression generally do not use an explicit dynamical model of the vocal folds. They take several different approaches (Kleijn & Paliwal 1995), including: (a) waveform coders with no model of speech production; (b) source coders, which use a vocal tract model and a simple characterisation of the vocal fold behaviour, i.e. whether it is periodic or noise-like; and (c) hybrid methods with a vocal tract model and selection of a representation of the vocal fold behaviour that minimises overall waveform error. In this paper we introduce a method for modelling the dynamical behaviour of the vocal folds in speech processing. This method is based around a discrete dynamical model that is suitable for direct fitting to the vocal fold signal, so that parameters representing the biomechanical behaviour of the vocal folds can be identified. These parameters, together with the initial conditions and the model residual, are an exact but smaller representation, in the information-theoretic sense, of the vocal fold dynamics. This representation could then, for example, form the basis for a low bit-rate source coder.
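The idea that parameters, initial conditions and a residual together reproduce a signal exactly can be illustrated with a minimal sketch. The code below is not the variational-integrator model of the paper: it assumes a plain second-order linear recursion as a stand-in discrete model, fits it by least squares to a synthetic signal, and then rebuilds the signal from the two fitted coefficients, the first two samples and the residual. All function names and the synthetic test signal are hypothetical and chosen only for illustration.

```python
import math

def fit_second_order(x):
    """Least-squares fit of x[n] ~ a1*x[n-1] + a2*x[n-2] (illustrative stand-in model)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        u, v, y = x[n - 1], x[n - 2], x[n]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        b1 += u * y
        b2 += v * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (s11 * b2 - s12 * b1) / det
    return a1, a2

def model_residual(x, a1, a2):
    """Part of each sample that the fitted recursion does not predict."""
    return [x[n] - (a1 * x[n - 1] + a2 * x[n - 2]) for n in range(2, len(x))]

def reconstruct(x0, x1, a1, a2, residual):
    """Rebuild the signal from initial conditions, parameters and residual."""
    x = [x0, x1]
    for r in residual:
        x.append(a1 * x[-1] + a2 * x[-2] + r)
    return x

# Synthetic test signal (an assumption): a damped oscillation plus a small perturbation.
signal = [math.exp(-0.01 * n) * math.sin(0.3 * n) + 0.01 * math.sin(1.7 * n)
          for n in range(200)]

a1, a2 = fit_second_order(signal)
residual = model_residual(signal, a1, a2)
rebuilt = reconstruct(signal[0], signal[1], a1, a2, residual)

max_err = max(abs(p - q) for p, q in zip(signal, rebuilt))
signal_energy = sum(s * s for s in signal[2:])
residual_energy = sum(r * r for r in residual)
```

In this sketch the reconstruction matches the original up to floating-point rounding, while the residual carries much less energy than the signal itself. That reduction is what would make the residual cheaper to encode in a coder built along these lines.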

Similar resources

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...

Hartley Series Direct Method for Variational Problems

The computational method based on using the operational matrix of an orthogonal function for solving variational problems is computer-oriented. In this approach, a truncated Hartley series together with the operational matrix of integration and integration of the cross product of two cas vectors are used for finding the solution of variational problems. Two illustrative...

Publication date: 2004